A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares
Authors
Garvesh Raskutti, Michael W. Mahoney
Abstract
We consider statistical aspects of solving large-scale least-squares (LS) problems using randomized sketching algorithms. For a LS problem with input data (X, Y) ∈ ℝ^{n×p} × ℝ^n, where n and p are both large and n ≫ p, sketching algorithms use a “sketching matrix,” S ∈ ℝ^{r×n}, where r ≪ n, e.g., a matrix representing the process of random sampling or random projection. Then, rather than solving the LS problem using the full data (X, Y), sketching algorithms solve the LS problem using only the “sketched data” (SX, SY) ∈ ℝ^{r×p} × ℝ^r. Prior work has typically adopted an algorithmic perspective, in that it has made no statistical assumptions on the input X and Y; instead, it has assumed that the data (X, Y) are fixed and worst-case. In this paper, we adopt a statistical perspective, and we consider the mean-squared error performance of randomized sketching algorithms when the data (X, Y) are generated according to a statistical linear model Y = Xβ + ε, where ε is a noise process. To do this, we first develop a framework for assessing, in a unified manner, algorithmic and statistical aspects of randomized sketching methods. We then consider the statistical prediction efficiency (SPE) and the statistical residual efficiency (SRE) of the sketched LS estimator, and we use our framework to provide results for several types of random projection and random sampling sketching algorithms. Among other results, we show that the SRE can be bounded when p ≲ r ≪ n, but that the SPE typically requires the sample size r to be substantially larger. Our theoretical results reveal that, depending on the specifics of the situation, leverage-based sampling methods can perform as well as or better than projection methods. Our empirical results reveal that when r is only slightly greater than p and much less than n, projection-based methods outperform sampling-based methods; but as r grows, sampling methods start to outperform projection methods.
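To make the setup concrete, the following is a minimal numpy sketch (our illustration, not the paper's code) of the sketched LS estimator with a Gaussian random projection. The dimensions n, p, r, the standard-normal design, and the unit noise level are illustrative assumptions; the final ratio is a finite-sample analogue of the residual comparison behind the SRE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions with n >> p and p < r << n (assumed values).
n, p, r = 20_000, 10, 200

# Statistical linear model Y = X beta + eps.
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
Y = X @ beta_true + rng.standard_normal(n)

# Full OLS solution, which the sketched estimator approximates.
beta_full, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Gaussian random projection: S in R^{r x n} with i.i.d. N(0, 1/r) entries.
S = rng.standard_normal((r, n)) / np.sqrt(r)

# Solve the LS problem using only the sketched data (SX, SY).
beta_sketch, *_ = np.linalg.lstsq(S @ X, S @ Y, rcond=None)

# The SRE compares residual sums of squares like these, in expectation.
rss_full = np.sum((Y - X @ beta_full) ** 2)
rss_sketch = np.sum((Y - X @ beta_sketch) ** 2)
print("RSS ratio (sketched / full):", rss_sketch / rss_full)
```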
Similar papers
Statistical and Algorithmic Perspectives on Randomized Sketching for Ordinary Least-Squares
We consider statistical and algorithmic aspects of solving large-scale least-squares (LS) problems using randomized sketching algorithms. Prior results show that, from an algorithmic perspective, when using sketching matrices constructed from random projections and leverage-score sampling, if the number of samples r is much smaller than the original sample size n, then the worst-case (WC) error is...
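The leverage-score sampling mentioned here admits a short illustration. The following is a generic sketch under our own conventions (exact leverage scores via a QR factorization, sampling with replacement, and the standard 1/√(r·pᵢ) rescaling), not the paper's implementation.

```python
import numpy as np

def leverage_score_sketch(X, Y, r, rng):
    """Sample r rows of (X, Y) with probabilities proportional to their
    leverage scores, rescaled so the sketched LS problem is unbiased."""
    # Leverage scores are squared row norms of an orthonormal basis of X.
    Q, _ = np.linalg.qr(X)
    lev = np.sum(Q**2, axis=1)              # leverage scores; they sum to p
    probs = lev / lev.sum()
    idx = rng.choice(X.shape[0], size=r, replace=True, p=probs)
    scale = 1.0 / np.sqrt(r * probs[idx])   # standard rescaling
    return scale[:, None] * X[idx], scale * Y[idx]

rng = np.random.default_rng(1)
n, p, r = 20_000, 10, 200
# Heterogeneous row scales make the leverage scores non-uniform.
X = rng.standard_normal((n, p)) * rng.uniform(0.1, 3.0, size=(n, 1))
Y = X @ rng.standard_normal(p) + rng.standard_normal(n)
SX, SY = leverage_score_sketch(X, Y, r, rng)
beta_lev, *_ = np.linalg.lstsq(SX, SY, rcond=None)
```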
Fast and Guaranteed Tensor Decomposition via Sketching
Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in statistical learning of latent variable models and in data mining. In this paper, we propose fast and randomized tensor CP decomposition algorithms based on sketching. We build on the idea of count sketches, but introduce many novel ideas which are unique to tensors. We develop novel methods for randomized computation of tenso...
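The count sketches this work builds on hash each input row to a random bucket with a random sign. The following is a minimal matrix version of that primitive (our illustration, not the tensorized construction the paper develops).

```python
import numpy as np

def count_sketch(X, r, rng):
    """Count sketch of the rows of X: equivalent to S @ X for a sparse
    S in R^{r x n} with a single random +/-1 per column, computed by
    hashing each row to a bucket in O(nnz(X)) time."""
    n = X.shape[0]
    buckets = rng.integers(0, r, size=n)        # hash h: [n] -> [r]
    signs = rng.choice([-1.0, 1.0], size=n)     # sign s: [n] -> {+1, -1}
    SX = np.zeros((r, X.shape[1]))
    np.add.at(SX, buckets, signs[:, None] * X)  # scatter-add signed rows
    return SX

rng = np.random.default_rng(2)
A = rng.standard_normal((10_000, 5))
SA = count_sketch(A, r=128, rng=rng)            # a 128 x 5 sketch of A
```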
Iterative Hessian Sketch: Fast and Accurate Solution Approximation for Constrained Least-Squares
We study randomized sketching methods for approximately solving a least-squares problem with a general convex constraint. The quality of a least-squares approximation can be assessed in different ways: either in terms of the value of the quadratic objective function (cost approximation), or in terms of some distance measure between the approximate minimizer and the true minimizer (solution approx...
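For the unconstrained case, the iterative Hessian sketch reduces to a simple iteration: approximate the Hessian XᵀX with a fresh sketch at each step while keeping the exact gradient on the full data. A minimal rendering under that reading (Gaussian sketches, no constraint set), as an assumption-laden sketch rather than the authors' implementation:

```python
import numpy as np

def iterative_hessian_sketch(X, Y, r, iters, rng):
    """Unconstrained iterative Hessian sketch: each step solves a
    Newton-like system with a sketched Hessian (S_t X)^T (S_t X)
    but an exact gradient X^T (Y - X beta) on the full data."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(iters):
        S = rng.standard_normal((r, n)) / np.sqrt(r)   # fresh sketch S_t
        SX = S @ X
        grad = X.T @ (Y - X @ beta)                    # exact gradient term
        beta = beta + np.linalg.solve(SX.T @ SX, grad)
    return beta

rng = np.random.default_rng(3)
n, p = 20_000, 10
X = rng.standard_normal((n, p))
Y = X @ rng.standard_normal(p) + rng.standard_normal(n)
beta_ihs = iterative_hessian_sketch(X, Y, r=200, iters=5, rng=rng)
```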
Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging
We address the statistical and optimization impacts of using classical sketch versus Hessian sketch to approximately solve the Matrix Ridge Regression (MRR) problem. Prior research has considered the effects of classical sketch on least squares regression (LSR), a strictly simpler problem. We establish that classical sketch has a similar effect upon the optimization properties of MRR as it does...
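The classical-versus-Hessian distinction can be stated in a few lines for ridge regression. In this hedged sketch (our scaling conventions, with the regularizer γ folded into a single term), classical sketch compresses both X and Y, while Hessian sketch compresses only the quadratic term and keeps XᵀY exact.

```python
import numpy as np

def classical_sketch_ridge(X, Y, S, gamma):
    """Classical sketch: solve the ridge problem on (SX, SY)."""
    SX, SY = S @ X, S @ Y
    p = X.shape[1]
    return np.linalg.solve(SX.T @ SX + gamma * np.eye(p), SX.T @ SY)

def hessian_sketch_ridge(X, Y, S, gamma):
    """Hessian sketch: sketch only the quadratic term X^T X; keep X^T Y."""
    SX = S @ X
    p = X.shape[1]
    return np.linalg.solve(SX.T @ SX + gamma * np.eye(p), X.T @ Y)

rng = np.random.default_rng(4)
n, p, r, gamma = 20_000, 10, 200, 1.0
X = rng.standard_normal((n, p))
Y = X @ rng.standard_normal(p) + rng.standard_normal(n)
S = rng.standard_normal((r, n)) / np.sqrt(r)
w_classical = classical_sketch_ridge(X, Y, S, gamma)
w_hessian = hessian_sketch_ridge(X, Y, S, gamma)
```

The model averaging studied in this line of work then averages such solutions over several independent sketches S.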
Near Optimal Sketching of Low-Rank Tensor Regression
We study the least squares regression problem min_{Θ ∈ S_{D,R}} ‖AΘ − b‖₂, where S_{D,R} is the set of Θ for which Θ = ∑_{r=1}^{R} θ_1^{(r)} ◦ ⋯ ◦ θ_D^{(r)} for vectors θ_d^{(r)} ∈ ℝ^{p_d} for all r ∈ [R] and d ∈ [D], and ◦ denotes the outer product of vectors. That is, Θ is a low-dimensional, low-rank tensor. This is motivated by the fact that the number of parameters in Θ is only R · ∑_{d=1}^{D} p_d, which is significantly...
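The parameter count R · ∑_{d=1}^{D} p_d can be checked directly by materializing a rank-R CP tensor from its factor vectors; a small illustration with assumed dimensions follows.

```python
import numpy as np

rng = np.random.default_rng(5)
R, dims = 3, (8, 9, 10)                     # illustrative rank R and p_d
factors = [rng.standard_normal((R, p)) for p in dims]

# Theta = sum_{r=1}^{R} theta_1^(r) o ... o theta_D^(r)
Theta = np.zeros(dims)
for r in range(R):
    outer = factors[0][r]
    for f in factors[1:]:
        outer = np.multiply.outer(outer, f[r])  # outer product of vectors
    Theta += outer

print("CP parameters:", R * sum(dims), "vs dense entries:", np.prod(dims))
```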
Journal: Journal of Machine Learning Research
Volume: 17, Issue: –
Pages: –
Publication year: 2016